EpiCompare compares different epigenetic datasets for benchmarking and quality control purposes. The report consists of three main sections:

  1. General Metrics: Metrics on fragments (duplication rate) and peaks (blacklisted peaks and widths)
  2. Peak Overlaps: Percentage and statistical significance of overlapping and unique peaks
  3. Functional Annotation: Functional annotation (ChromHMM, ChIPseeker, enrichment and motif analysis) of peaks

Input peak files. Total of 2 samples:

## [1] "File1: CnR"
## [1] "File2: CnT"


1. General Metrics


Fragment Information

This information is displayed only if summary metrics from Picard are provided. See the help manual.

  • Mapped_Fragments: Number of mapped read pairs in the file.
  • Duplication_Rate: Percentage of mapped sequences that are marked as duplicates.
  • Unique_Fragments: Number of mapped sequences that are not marked as duplicates.
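
The relationship between the three fragment metrics can be sketched as follows. This is a minimal illustration based only on the definitions above; the function name and the input counts are hypothetical, and Picard's own duplication metrics are computed from more detailed read categories than this sketch covers.

```python
def fragment_summary(mapped_fragments: int, duplicate_fragments: int) -> dict:
    """Derive the three fragment metrics from two raw counts.

    Both arguments are hypothetical inputs used for illustration only.
    """
    if mapped_fragments <= 0:
        raise ValueError("mapped_fragments must be positive")
    if not 0 <= duplicate_fragments <= mapped_fragments:
        raise ValueError("duplicate_fragments must lie in [0, mapped_fragments]")

    # Percentage of mapped fragments that are marked as duplicates.
    duplication_rate = 100.0 * duplicate_fragments / mapped_fragments
    # Mapped fragments that are not marked as duplicates.
    unique_fragments = mapped_fragments - duplicate_fragments

    return {
        "Mapped_Fragments": mapped_fragments,
        "Duplication_Rate": round(duplication_rate, 2),
        "Unique_Fragments": unique_fragments,
    }


# Made-up counts, not taken from the samples in this report.
summary = fragment_summary(mapped_fragments=1_000_000, duplicate_fragments=250_000)
print(summary)
```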


Peak Information

  • PeakN before tidy: Total number of peaks including those blacklisted and those in non-standard chromosomes.
  • Blacklisted peaks removed (%): Percentage of peaks in the sample that overlap blacklisted regions (blacklist). The ENCODE blacklist includes regions of the genome that have anomalous and/or unstructured signals independent of the cell line or experiment.
  • Non-standard peaks removed (%): Percentage of peaks identified in non-standard and mitochondrial chromosomes.
  • PeakN after tidy: Total number of peaks after removing blacklisted peaks and those in non-standard chromosomes.


Sample | PeakN before tidy | Blacklisted peaks removed (%) | Non-standard peaks removed (%) | PeakN after tidy
CnR    | 2707              | 6.10                          | 0                              | 2542
CnT    | 1670              | 5.39                          | 0                              | 1580
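
As a consistency check, the columns of the table can be related arithmetically. The sketch below assumes that both percentages are expressed relative to PeakN before tidy and that the two removed categories do not overlap; these are assumptions made for illustration, and the helper name is hypothetical.

```python
def peaks_after_tidy(peak_n_before: int, blacklisted_pct: float,
                     non_standard_pct: float) -> int:
    """Estimate PeakN after tidy from the reported percentages.

    Assumes both percentages are fractions of peak_n_before and that
    no peak is counted in both categories.
    """
    removed = round(peak_n_before * (blacklisted_pct + non_standard_pct) / 100)
    return peak_n_before - removed


# Values copied from the table above:
# (PeakN before, blacklisted %, non-standard %, PeakN after)
table = {
    "CnR": (2707, 6.10, 0.0, 2542),
    "CnT": (1670, 5.39, 0.0, 1580),
}

results = {}
for sample, (before, blacklisted, non_standard, reported_after) in table.items():
    results[sample] = peaks_after_tidy(before, blacklisted, non_standard)
    print(sample, results[sample], "reported:", reported_after)
```

Under these assumptions the estimate matches the reported PeakN after tidy for both samples (roughly 165 peaks removed from CnR and 90 from CnT).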

Peak widths

Distribution of peak widths in each sample after tidying.



2. Peak Overlaps

Individual samples

Heatmap of percentage overlaps between input peak files. Hover over the heatmap for percentage values.
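
The values behind such a heatmap can be sketched in a few lines. This is a simplified pure-Python illustration, not the implementation used by EpiCompare: the peak coordinates are made up, intervals are treated as half-open, and a peak counts as overlapping if it shares at least one base with any peak in the other file. Which file serves as the denominator in the report's heatmap is not stated here, so the sketch computes both directions.

```python
from typing import Dict, List, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end), half-open interval


def overlaps(a: Peak, b: Peak) -> bool:
    """True if two peaks are on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def percent_overlap(query: List[Peak], subject: List[Peak]) -> float:
    """Percentage of query peaks that overlap at least one subject peak."""
    if not query:
        return 0.0
    hits = sum(any(overlaps(q, s) for s in subject) for q in query)
    return 100.0 * hits / len(query)


def overlap_matrix(samples: Dict[str, List[Peak]]) -> Dict[str, Dict[str, float]]:
    """Pairwise matrix: rows are query files, columns are subject files."""
    return {
        a: {b: percent_overlap(peaks_a, peaks_b) for b, peaks_b in samples.items()}
        for a, peaks_a in samples.items()
    }


# Toy peaks for illustration only; not the peaks summarised in this report.
samples = {
    "CnR": [("chr1", 100, 200), ("chr1", 500, 600),
            ("chr2", 50, 150), ("chr2", 900, 950)],
    "CnT": [("chr1", 150, 250), ("chr2", 100, 120)],
}

matrix = overlap_matrix(samples)
for name, row in matrix.items():
    print(name, row)
```

Note that the matrix is not symmetric: in the toy data half of the CnR peaks overlap a CnT peak, while every CnT peak overlaps a CnR peak.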

Significance of overlapping vs unique peaks

The plot is displayed only if a reference peak file is provided and stat_plot = TRUE. Depending on the format of the reference file, the function outputs different plots:

  • If the reference file has BED6+4 format (peaks called with MACS2), the plot is a paired boxplot showing the distribution of -log10(q-value) for overlapping and unique peaks per sample.
  • If the reference file does not have BED6+4 format, the plot is a barplot of percentage overlap per sample, coloured by adjusted p-value.
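
The choice between the two plots can be sketched as a simple dispatch on the reference file format. The sketch assumes that BED6+4 is recognised by its ten tab-separated columns, as in the narrowPeak format; how EpiCompare itself detects the format is not described in this report, and the function names and example records are hypothetical.

```python
def is_bed6_plus_4(record: str) -> bool:
    """Heuristic: a BED6+4 (narrowPeak-style) record has exactly 10 tab-separated fields."""
    return len(record.rstrip("\n").split("\t")) == 10


def choose_stat_plot(first_record: str) -> str:
    """Return a label for the plot type described in the two cases above."""
    if is_bed6_plus_4(first_record):
        return "paired boxplot of -log10(q-value)"
    return "barplot of percentage overlap"


# Hypothetical example records (values are made up).
narrowpeak_line = "chr1\t100\t200\tpeak1\t500\t.\t12.5\t20.1\t18.3\t50"
bed3_line = "chr1\t100\t200"

plot_for_narrowpeak = choose_stat_plot(narrowpeak_line)
plot_for_bed3 = choose_stat_plot(bed3_line)
print(plot_for_narrowpeak)
print(plot_for_bed3)
```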


3. Functional Annotation

3.1 ChromHMM Annotation

ChromHMM annotates different chromatin states (ChromHMM). The annotations used were obtained from here.

All samples

ChromHMM annotation of individual peak files.

Overlap: Sample peaks in Reference peaks

ChromHMM annotation of sample peaks found in reference peaks.

Overlap: Reference peaks in Sample peaks

ChromHMM annotation of reference peaks found in sample peaks.

Unique: Sample peaks not in Reference peaks

ChromHMM annotation of sample peaks not found in reference peaks.

Unique: Reference peaks not in Sample peaks

ChromHMM annotation of reference peaks not found in sample peaks.
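
The four overlap and unique categories above can be derived by splitting each peak set against the other. The following is a minimal sketch with made-up coordinates, assuming half-open intervals and that a single shared base counts as an overlap; it does not reproduce EpiCompare's own overlap logic, and the function names are hypothetical.

```python
from typing import List, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end), half-open interval


def overlaps(a: Peak, b: Peak) -> bool:
    """True if two peaks are on the same chromosome and share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def split_by_overlap(query: List[Peak],
                     subject: List[Peak]) -> Tuple[List[Peak], List[Peak]]:
    """Split query peaks into (found in subject, not found in subject)."""
    found, not_found = [], []
    for q in query:
        target = found if any(overlaps(q, s) for s in subject) else not_found
        target.append(q)
    return found, not_found


# Toy peak sets for illustration only.
sample_peaks = [("chr1", 100, 200), ("chr1", 500, 600)]
reference_peaks = [("chr1", 150, 250), ("chr3", 10, 90)]

sample_in_ref, sample_not_in_ref = split_by_overlap(sample_peaks, reference_peaks)
ref_in_sample, ref_not_in_sample = split_by_overlap(reference_peaks, sample_peaks)

print("Sample peaks in Reference peaks:", sample_in_ref)
print("Reference peaks in Sample peaks:", ref_in_sample)
print("Sample peaks not in Reference peaks:", sample_not_in_ref)
print("Reference peaks not in Sample peaks:", ref_not_in_sample)
```

Each of the four resulting lists would then be annotated separately with chromatin states.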

3.2 ChIPseeker Annotation

3.3 Enrichment analysis

KEGG Enrichment Analysis

GO Enrichment Analysis

3.4 Motif analysis